Okay, so let's get started.
Hello everyone, welcome back.
So last time we looked into convolutional neural networks and I want to start with a
quick recap.
So if you have an image-processing program like Photoshop or GIMP, then you know that you can apply filters to an image, for example to enhance the edges or to blur the image.
And last time I told you that a convolutional network basically does exactly the same thing, except that the filter can be learned.
So you would provide an image that you feed into your neural network, and then, layer by layer, different filters would be applied. It could look like the situation shown here, where you imagine that you scan across the pixels of an image and always collect information from all the neighboring pixels, apply a filter, that is, you weight the pixel values, and then you combine them into the result that is written into the pixel of the outgoing image.
So that's the idea behind a convolutional neural network.
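To make this scanning-and-weighting picture concrete, here is a minimal NumPy sketch of the operation just described; the names apply_filter, image and kernel are purely illustrative and not part of any particular library:

import numpy as np

def apply_filter(image, kernel):
    # Slide the kernel across the image; at each position, weight the
    # neighboring pixel values by the kernel entries and sum them up.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)
    return out

# Example: a hand-crafted 3x3 edge-enhancing filter on a random "image"
# (the filter values are an assumption for illustration).
image = np.random.rand(8, 8)
kernel = np.array([[ 0., -1.,  0.],
                   [-1.,  4., -1.],
                   [ 0., -1.,  0.]])
print(apply_filter(image, kernel).shape)   # (6, 6)

In a convolutional network, the entries of such a kernel would be the trainable weights rather than hand-crafted values.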
It's called convolutional because, mathematically, the operation we are applying is just a convolution, here in a discrete space defined by the grid of pixels. It is extremely useful simply because the amount of information you have to store in these filters is much, much smaller than if you were to process the same size of images using a fully connected neural network.
So if you have n pixels in an image, then for a fully connected neural network that maps an n-pixel image to another n-pixel image, you would have n squared weights, that is, n squared connections, and you would have to store all these weights and train them.
For a convolutional neural network, on the other hand, the number of parameters does not even scale with n; it is just given by the number of pixels that your filter encompasses. Maybe that's a 5 by 5 filter, so you would have 25 trainable weights to store.
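As a quick back-of-the-envelope check of this scaling, here is the arithmetic in a few lines of Python; the concrete image size of 100 by 100 pixels is just an assumption for illustration:

# Parameter counts for the comparison just made, assuming (purely for
# illustration) a 100 x 100 monochrome image, i.e. n = 10,000 pixels.
n = 100 * 100                      # number of pixels in the image
fully_connected_weights = n * n    # one weight per input-output pixel pair
convolutional_weights = 5 * 5      # a single shared 5x5 filter
print(fully_connected_weights)     # 100000000
print(convolutional_weights)       # 25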
And so that not only means that the memory consumption is much lower, but it also means that training is much more efficient, because in a sense what the neural network learns from features it observes, say, in the lower left-hand corner of the image, it will later also be able to apply to the upper right-hand corner of the image. So in a sense a single training image already provides as much training information as if you had many, many small patches and applied your neural network to each of those. So it's really efficient.
So let's jump directly to this example of how a typical convolutional neural network looks in practice. It's actually a combination of a convolutional neural network consisting of several stages, plus subsequent stages of a standard fully connected neural network that do the final steps of processing.
So in the example that I show here, we have the input image. Let's say it is a monochrome image, so it has only one channel in that sense, one color channel. It can then be mapped by these convolutional filters into an image that, in this case, has several channels. If these channels are input channels, they would possibly correspond to a color image, like red, green, blue, but further up in the stages they can represent anything. And it's actually not you who decides what these channels mean; it's the neural network, via the training process, that makes the choices as to what these channels mean. And if you're lucky, you can later try to analyze what is going on.
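A minimal sketch of this kind of stack could look as follows in PyTorch; the choice of framework, the number of channels, the kernel sizes, and the input size of 28 by 28 pixels are all assumptions for illustration, not the exact network shown here:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=5),   # 1 input channel -> 8 channels
    nn.ReLU(),
    nn.Conv2d(in_channels=8, out_channels=16, kernel_size=5),  # 8 channels -> 16 channels
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 20 * 20, 10),   # fully connected final stage (for 28x28 inputs)
)

x = torch.randn(1, 1, 28, 28)      # one monochrome 28x28 image
print(model(x).shape)              # torch.Size([1, 10])

The convolutional layers here play the role of the learned filters, and the final fully connected layer does the last stage of processing, just as described above.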
And then we also learned last time that, not only in image processing but also for this kind of neural network, it's smart to play around with the resolution. So for example you can decrease the resolution by always averaging over small patches of pixels.
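A minimal sketch of this resolution reduction (average pooling) in NumPy, assuming for simplicity that the patch size divides the image size evenly:

import numpy as np

def average_pool(image, patch=2):
    # Split the image into non-overlapping patch x patch blocks and
    # replace each block by the average of its pixel values.
    h, w = image.shape
    blocks = image.reshape(h // patch, patch, w // patch, patch)
    return blocks.mean(axis=(1, 3))

image = np.random.rand(8, 8)
print(average_pool(image).shape)   # (4, 4): resolution halved in each direction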